-
Multi-Agent Reinforcement Learning can be used to learn solutions for a wide variety of tasks, but there are few safety guarantees about the policies that the agents learn. My research addresses the challenge of ensuring safety in communication-free multi-agent environments, using shielding as the primary tool. We introduce methods to completely prevent safety violations in domains for which a model is available, in both fully observable and partially observable environments. We present ongoing research on maximizing safety in environments for which no model is available, utilizing a centralized training, decentralized execution framework, and discuss future lines of research.
Free, publicly-accessible full text available June 5, 2026.
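As an illustration of the model-based shielding idea in this abstract, the sketch below shows a shield that uses a known transition model to veto unsafe joint actions before execution. The toy domain (two agents on a line, collision as the unsafe set) and the names `step` and `shield` are assumptions for illustration, not the paper's construction.

```python
# Toy model-based shield: two agents on a 1-D grid must never occupy
# the same cell. The shield simulates each proposed joint action with
# the known model and substitutes a safe alternative when needed.

UNSAFE = lambda s: s[0] == s[1]  # collision predicate on joint states

def step(state, actions):
    """Deterministic toy model: each agent moves by -1, 0, or +1."""
    return tuple(s + a for s, a in zip(state, actions))

def shield(state, proposed, candidates=(-1, 0, 1)):
    """Return the proposed joint action if the model says it is safe;
    otherwise substitute the first safe joint action found."""
    if not UNSAFE(step(state, proposed)):
        return proposed
    for a0 in candidates:
        for a1 in candidates:
            if not UNSAFE(step(state, (a0, a1))):
                return (a0, a1)
    raise RuntimeError("no safe joint action exists from this state")

state = (0, 2)
assert shield(state, (1, -1)) != (1, -1)  # (1, 1) would collide: overridden
```

Because the model and unsafe set are known in advance, this style of shield can prevent violations entirely, which is what distinguishes it from the model-free setting in the later abstracts.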
-
Shielding is an effective method for ensuring safety in multi-agent domains; however, its applicability has previously been limited to environments for which an approximate discrete model and safety specification are known in advance. We present a method for learning shields in cooperative fully-observable multi-agent environments where neither a model nor a safety specification is provided, using architectural constraints to realize several important properties of a shield. We show through a series of experiments that our learned shielding method is effective at significantly reducing safety violations, while largely maintaining the ability of an underlying reinforcement learning agent to optimize for reward.
Free, publicly-accessible full text available May 19, 2026.
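A minimal sketch of the model-free idea in this abstract: when no model or specification is given, a shield can be fit from experience, here as a tabular safety critic that estimates violation rates and intervenes only when the policy's action looks risky. The tabular form, the class name `LearnedShield`, and the threshold are illustrative assumptions, not the paper's architecture.

```python
# Learned shield without a model: estimate per-(state, action) violation
# rates from logged transitions, then veto actions above a risk threshold.
from collections import defaultdict

class LearnedShield:
    def __init__(self, threshold=0.5):
        self.counts = defaultdict(lambda: [0, 0])  # (violations, visits)
        self.threshold = threshold

    def update(self, state, action, violated):
        c = self.counts[(state, action)]
        c[0] += int(violated)
        c[1] += 1

    def risk(self, state, action):
        v, n = self.counts[(state, action)]
        return v / n if n else 0.0

    def filter(self, state, proposed, alternatives):
        # Minimal interference: pass the policy's action through unchanged
        # when it looks safe; otherwise pick the lowest-risk alternative.
        if self.risk(state, proposed) < self.threshold:
            return proposed
        return min(alternatives, key=lambda a: self.risk(state, a))

shield = LearnedShield()
for _ in range(10):
    shield.update("s0", "left", violated=True)
    shield.update("s0", "right", violated=False)
assert shield.filter("s0", "left", ["left", "right"]) == "right"
```

The "intervene only when necessary" structure mirrors one property the abstract attributes to shields: the learner keeps optimizing reward because safe actions are never altered.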
-
Learning safe solutions is an important but challenging problem in multi-agent reinforcement learning (MARL). Shielded reinforcement learning is one approach for preventing agents from choosing unsafe actions. Current shielded reinforcement learning methods for MARL make strong assumptions about communication and full observability. In this work, we extend the formalization of the shielded reinforcement learning problem to a decentralized multi-agent setting. We then present an algorithm for decomposition of a centralized shield, allowing shields to be used in such decentralized, communication-free environments. Our results show that agents equipped with decentralized shields perform comparably to agents with centralized shields in several tasks, allowing shielding to be used in environments with decentralized training and execution for the first time.
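One way to picture the decomposition described in this abstract: treat the centralized shield as a set of safe joint actions, and shrink each agent's local action set until every remaining combination is jointly safe, so agents can act without communicating. The greedy construction and the names `decompose` and `safe_joint` below are illustrative assumptions, not the paper's algorithm.

```python
# Sketch of shield decomposition: find per-agent allowed action sets whose
# Cartesian product lies entirely inside the centralized safe set.
from itertools import product

def decompose(safe_joint, per_agent_actions):
    """Greedily remove actions whose combinations with the other agents'
    remaining actions can be unsafe, until all combinations are safe."""
    allowed = [set(acts) for acts in per_agent_actions]
    changed = True
    while changed:
        changed = False
        for i in range(len(allowed)):
            for a in sorted(allowed[i]):
                others = [sorted(s) for j, s in enumerate(allowed) if j != i]

                def joint(combo):
                    c = list(combo)
                    c.insert(i, a)  # splice agent i's action back in
                    return tuple(c)

                if len(allowed[i]) > 1 and any(
                    joint(c) not in safe_joint for c in product(*others)
                ):
                    allowed[i].discard(a)
                    changed = True
    return allowed

# Two agents, actions {0, 1}; the joint action (1, 1) is unsafe.
safe = {(0, 0), (0, 1), (1, 0)}
allowed = decompose(safe, [[0, 1], [0, 1]])
```

The trade-off the abstract's experiments measure shows up even here: the decentralized product set is a conservative subset of the centralized safe set, so some safe joint actions are sacrificed in exchange for communication-free execution.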